What is Predictive about Partial Least Squares?
نویسندگان
چکیده
Partial least squares (PLS) estimation of path models has become very popular in IS research, as an alternative to covariance-based methods. PLS path modeling is often referred to as being useful for “predictive” applications. In this work, we investigate the predictive aspects of PLS path modeling and its relation to predictive analytics and predictive assessment. In particular, we compare it to neural networks, which share several similarities. We conclude that PLS path modeling (the dominant form of usage in IS) is primarily causal-explanatory in nature. Introduction Partial Least Squares (PLS) estimation has become a popular method in information systems research, especially when it comes to analyzing adoption and usage of new electronic commerce applications (e.g., Pavlou and Fygenson (2006)). It is most commonly used to estimate structural equation models in place of covariance-based approaches (e.g., LISREL) in order to overcome the latter’s challenges (“technically demanding, often resulting in analytical errors, and, even if modeled correctly, is not a complete solution”, Chin et al. 2003, p.191). In structural equation models, one or more of the variables is assumed to be unobserved (latent). Theory is used to generate a causal diagram that relates the different variables (observable and latent) to each other. Then, data are used to quantify the hypothesized relationships, and the estimated relationships are used to test causal hypotheses. Hence, at their core, latent variable models are causal-explanatory. There have been claims that PLS path modeling is “causal-predictive” rather than causal-explanatory (Jöreskog and Wold, 1982; Anderson and Garbing, 1988). In this work we investigate the predictive aspects of PLS in light of the distinction between causal explanation and prediction by Shmueli & Koppius (2009). In particular, we look at the three steps in PLS path modeling: model estimation, validation, and use. PLS and Neural Networks To highlight the predictive nature of PLS path modeling, we examine it against a classic predictive model that appears somewhat similar: neural networks (NN) (Hackl and Westlund, 2000). NN are popular in data mining for predicting an outcome from a set of predictors. They are considered a “blackbox" in terms of interpretability and have been highly successful in terms of predictive accuracy. Similar to the use of latent variables (LVs) in PLS, in NN there is a diagram that connects the observable inputs and outputs via a network of unobservable (hidden) nodes and the NN algorithm then estimates the weights on the arrows connecting the different nodes. However, unlike path models, there is no underlying causal theoretical model, and the hidden nodes are conceptually unimportant. Hsu et al. (2006) compared PLS and NN numerically on simulated and real data. The simulations yielded similar results, whereas the real data yielded similar LV scores but different path coefficients. Model estimation: Two sets of models are estimated: outer models relate each latent variable to its observed variables (the measurement model), and inner models relate the latent variables to each other (the structural model). Each model is estimated using OLS regression (thereby assuming linear relationships). The path coefficients are then obtained via correlation or OLS regression between subsets of the estimated latent variables and/or observed variables. Estimation is done iteratively, until the various parameters are stable, by minimizing the various sums of squared errors. In other words, the final model best fits each of the regression models. Neural net estimation is similarly iterative. However, the algorithm tries to directly minimize the error of the observed output variables. NN also allow non-linear relationships between nodes. In light of these differences, PLS estimation is more similar to causal-explanatory modeling than to predictive modeling. Model validation: Validation is achieved by comparing the resulting estimates to the theoretical expectations. In particular, if coefficients do not have the right sign, then the corresponding observed variables are typically removed (Tenenhaus et al., 2005). Popular model validation measures include communality and redundancy. When computed by cross-validation (blindfolding), they measure how well the latent variables predict the observed variables (input or output). In contrast, in NN and in general predictive assessment, one examines how well the observed input variables predict the observed output variables. Hence, blindfolding serves more as a verification of the integrity of the estimated model in fitting the theoretical relationship between the observed and latent variables, rather than evaluates practical predictive power. Model use: The estimated PLS path model is used for testing causal hypotheses related to the structural diagram. Software output includes estimated weights, loadings, path coefficients, and correlations between latent variables. These are then examined carefully and used for inference. Due to the lack of an explicit overall statistical model and the nature of PLS estimation, statistical inference is achieved by using bootstrapping. Finally, path models are typically used only for causal testing and not for predicting new observations. In contrast, typical NN output does not include any inference-related information beyond the weights, and the focus is on its predictive accuracy on a holdout set (and comparing the training and holdout performance to detect over-fitting). The model is then used to predict new data. In summary, the final use of the estimated path model differs markedly from that of a predictive model. Conclusions In line with the differentiation between explanatory-causal statistical models and predictive analytics by Shmueli & Koppius (2009), our preliminary conclusion is that PLS path modeling is a causal-explanatory approach. Although PLS has some similarities with the predictive neural networks algorithm, its heavy reliance on a causal theoretical model, the conceptual importance of the latent variables, its method for model validation and its final use deem it very different from predictive models and from practical predictive use. If predictive power is important to PLS modelers then more emphasis should be given to the relationship between the observed input and output variables and the ability of the former to accurately predict the latter. Changes to the estimation procedure can also put more emphasis on prediction: Tenenhaus et al. (2005) suggested using PLS regression in place of OLS regression for outer model estimation. PLS regression is a predictive, shrinkage-based method which is indeed geared towards prediction. Finally, this is work in progress, and we plan to illustrate thevarious aspects mentioned above via simulation, the results of which will be presented at SCECR. ReferencesAnderson J C and Gerbing D W (1988), “Structural Equation Modeling in Practice: A Review and RecommendedTwo-Step Approach”, Psychological Bulletin, 103(3), pp. 411-423.Chin W W, Marcolin B L, and Newsted P R (2003), “A Partial Least Squares Latent Variable Modeling Approach forMeasuring Interaction Effects: Results from a Monte Carlo Simulation Study and an Electronic-MailEmotion/Adoption Study”, Information Systems Research, 14(2), pp. 189-217.Hackl P and Westlund A H (2000) “On Structural Equation Modelling for Customer Satisfaction Measurement”,Total Quality Management, 11, pp. S820-S825.Hsu S-H, Chen W-H, and Hsieh M-J (2006), “Robustness Testing of PLS, LISREL, EQS and ANN-based SEM forMeasuring Customer Satisfaction”, Total Quality Management, 17(3), pp. 355–371. Jöreskog K G and Wold H (1982), “The ML and PLS Techniques For Modeling with Latent Variables: Historical andComparative Aspects”, in Systems Under Indirect Observation: Causality, Structure, Prediction (Vol. I),Amsterdam: North-Holland, pp. 263-270.Pavlou P A and Fygenson M (2006), “Understanding and predicting electronic commerce adoption: An extensionof the theory of planned behavior”, MIS Quarterly, 30(1), pp. 115-143.Shmueli G and Koppius O (2009), “The Challenge of Prediction in Information Systems Research”, Working PaperRHS 06-058, School of Business, UMD (http://ssrn.com/abstract=1112893)Tenenhaus M, Vinzi V E, Chatelin Y-M, and Lauro C (2005), “PLS Path Modeling”, Computational Statistics & DataAnalysis, 48, pp. 159-205.
منابع مشابه
Spectrophotometric Simultaneous Kinetic Determination of Iodide and Iodate Using Partial Least-Squares Calibration Method in a Single Kinetic Run
A rapid, sensitive and versatile kinetic method is presented for the simultaneous spectrophotometric determination of iodide and iodate by partial least-squares regression (PLS) using original and derivate data named as absorbance and rate data. The method is based on the catalytic effect of the cited anions on the reaction rate between Ce(IV) and As(III) in 2 mol l?1 sulfuric acid medium. The ...
متن کاملDetermination of 137Ba Isotope Abundances in Water Samples by Inductively Coupled Plasma-optical Emission Spectrometry Combined with Least-squares Support Vector Machine Regression
A simple and rapid method for the determination of 137Ba isotope abundances in water samples by inductively coupled plasma-optical emission spectrometry (ICP-OES) coupled with least-squares support vector machine regression (LS-SVM) is reported. By evaluation of emission lines of barium, it was found that the emission line at 493.408 nm provides the best results for the determination...
متن کاملPredictive model selection in partial least squares path modeling
Predictive model selection metrics are used to select models with the highest out-of-sample predictive power among a set of models. R 2 and related metrics, which are heavily used in partial least squares path modeling, are often mistaken as predictive metrics. We introduce information theoretic model selection criteria that are designed for out-of-sample prediction and which do not require cre...
متن کاملKernel Partial Least Squares for Stationary Data
We consider the kernel partial least squares algorithm for non-parametric regression with stationary dependent data. Probabilistic convergence rates of the kernel partial least squares estimator to the true regression function are established under a source and an effective dimensionality condition. It is shown both theoretically and in simulations that long range dependence results in slower c...
متن کاملDetermination of Protein and Moisture in Fishmeal by Near-Infrared Reflectance Spectroscopy and Multivariate Regression Based on Partial Least Squares
The potential of Near Infrared Reflectance Spectroscopy (NIRS) as a fast method to predict the Crude Protein (CP) and Moisture (M) content in fishmeal by scanning spectra between 1000 and 2500 nm using multivariate regression technique based on Partial Least Squares (PLS) was evaluated. The coefficient of determination in calibration (R2C) and Standard Error of Calibra...
متن کاملMethodology and Theory for Partial Least Squares Applied to Functional Data
The partial least squares procedure was originally developed to estimate the slope parameter in multivariate parametric models. More recently it has gained popularity in the functional data literature. There, the partial least squares estimator of slope is either used to construct linear predictive models, or as a tool to project the data onto a one dimensional quantity that is employed for fur...
متن کامل